Text File | 1995-06-22 | 46.2 KB | 1,179 lines | [TEXT/ttxt] |
-
-
-
-
-
- 2-D rasterization.
- --------------------
-
-
-
-
-
-
- Re graphics: A picture is worth 10K words -
- but only those to describe the picture.
- Hardly any sets of 10K words can be adequately
- described with pictures.
-
-
-
-
-
-
 - Rasterization: the term refers to the technique we are using to
 - represent images: turning them into a mosaic of small square pixel cells
 - set to different colours. The algorithms performing just that are covered
 - in this section.
-
- A dot.
- ------
 - Rasterizing a dot is very straightforward; I am not even sure it is
 - warranted to call it rasterization at all. All it involves is setting
 - some location in the screen memory to a particular colour. Let's recall
 - that the very first location of the colourmap describes the top leftmost
 - pixel; that determines the system of reference which is convenient in
 - this case, with the X axis directed to the right and the Y axis directed
 - downward.
-
-
 -                     HW_SCREEN_X_SIZE
-
- (0,0)+-+-+-+- ... -+-------> X
- | | | |
- +-+-+ +
- |
 -  HW_SCREEN_Y_SIZE     ...
-
- |
- +-+- +-+
- | | |*|<------y
- +-+-+ +-+
- | ^
- | |
- V |
- Y x
-
-
- pic 3.1
-
-
 - Assuming that each pixel is represented by one byte, and that the dimensions
 - of the output surface are HW_SCREEN_X_SIZE by HW_SCREEN_Y_SIZE, it is the
 - y*HW_SCREEN_X_SIZE+x location we have to access in order to plot a dot
 - at (x,y). Here's some code to do just that:
-
-
- unsigned char *G_buffer; /* colourmap- offscreen buffer */
-
- void G_dot(int x,int y,unsigned char colour)
- {
- G_buffer[y*HW_SCREEN_X_SIZE+x]=colour;
- }
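 - As a quick check of the addressing formula above, we can plot a couple
 - of dots and inspect the buffer directly (a self-contained sketch; the
 - screen dimensions here are assumed values, and the buffer is statically
 - allocated for simplicity):

```c
#include <assert.h>

#define HW_SCREEN_X_SIZE 320            /* assumed dimensions, for the test */
#define HW_SCREEN_Y_SIZE 200

unsigned char G_buffer[HW_SCREEN_X_SIZE*HW_SCREEN_Y_SIZE];

void G_dot(int x,int y,unsigned char colour)
{
 G_buffer[y*HW_SCREEN_X_SIZE+x]=colour;
}

void G_dot_check(void)
{
 G_dot(0,0,1);                          /* top leftmost pixel */
 G_dot(5,2,7);                          /* sixth column of the third line */

 assert(G_buffer[0]==1);                /* (0,0) is the very first byte */
 assert(G_buffer[2*HW_SCREEN_X_SIZE+5]==7);
}
```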
-
- A line.
- -------
 - Line rasterization is a very important piece of code for the library. It
 - would also be used for polygon rendering and a few other routines. What we
 - have to do is set to some specified colour all dots on the line's path.
 - We can easily calculate the coordinates of all dots belonging to this line:
-
- * (xend,yend) ---
- / |
- * (x,y) dy
- / |
- / |y |
- (xstart,ystart) *-----+ ---------
- x
- |
- | |
- |- -dx- -|
-
- pic 3.2
-
- Remembering a little bit of high (high?) school geometry, it can be
- seen that if:
- dx = xend-xstart
- dy = yend-ystart
- then:
- x dx x * dy
- --- = ---- and y = --------
- y dy dx
-
 - Since we want to set all pixels along the path we just have to take
 - all Xs in the range [xstart, xend] and calculate the respective Y. There
 - is a bit of a catch: we would have only xend-xstart calculated
 - coordinates. If the Y range is actually the longer one, the path we have
 - just found won't be continuous, pic 3.3. In this case we should instead
 - calculate coordinates taking values from the longer range
 - [ystart, yend] and getting the respective X, pic 3.4, as:
-
- y * dx
- x = --------
- dy
-
- ....* ....*
- ..... ....*
- ...*. ...*.
- ..... ...*.
- ..*.. yrange ..*.. yrange
- ..... ..*..
- .*... .*...
- ..... .*...
- *.... *....
-
- xrange xrange
-
- pic 3.3 pic 3.4
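 - The division-based approach just described can be sketched as follows
 - (a minimal illustration, not the library's actual routine; G_dot and the
 - buffer are as defined earlier, and the name G_line_naive is made up):

```c
#include <stdlib.h>

#define HW_SCREEN_X_SIZE 320            /* assumed dimensions */
#define HW_SCREEN_Y_SIZE 200

unsigned char G_buffer[HW_SCREEN_X_SIZE*HW_SCREEN_Y_SIZE];

void G_dot(int x,int y,unsigned char colour)
{
 G_buffer[y*HW_SCREEN_X_SIZE+x]=colour;
}

/* naive line: walk along the longer range, divide to get the other
   coordinate; one division per plotted dot */
void G_line_naive(int xstart,int ystart,int xend,int yend,
                  unsigned char colour)
{
 int dx=xend-xstart, dy=yend-ystart;
 int i,step;

 if(abs(dx)>=abs(dy))                   /* X is the longer range */
 {
  step = dx>=0 ? 1 : -1;
  for(i=0;i!=dx+step;i+=step)
   G_dot(xstart+i, ystart+(dx ? i*dy/dx : 0), colour);
 }
 else                                   /* Y is the longer range */
 {
  step = dy>=0 ? 1 : -1;
  for(i=0;i!=dy+step;i+=step)
   G_dot(xstart+(i*dx/dy), ystart+i, colour);
 }
}
```

 - Walking the longer range guarantees the path is continuous, as in
 - pic 3.4; the price is the division inside the loop, which the next
 - algorithm gets rid of.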
-
-
 - What we just did is a quite logical and easily programmable algorithm,
 - but... it is not that commonly used. The reason has to do with the way
 - we did the calculation: it involved division, and that is by far the
 - slowest operation any computer does. What we would do instead doesn't
 - involve a single division. It is called Bresenham's integer-based
 - algorithm (Bresenham, 1965), and here we find all points using a few
 - logical operations and integer additions. The idea is to find the bigger
 - range among [xstart,xend] and [ystart,yend], go along its points and
 - have a variable saying when it is time to advance in the other range.
 - Let's consider what is happening at some stage of line rasterization;
 - let's assume X is the longer range (dx>dy) and that for the line we are
 - rendering xstart<xend and ystart<yend:
-
- |
- | | H(x+1,y+1)
- -+---|---*-
- | |h/
- | | | * I(x+1,real_y)
- | /|
- + - - - - + - - - - + -
- | / |
- | | / | |l |
- / |
- -*---|---*-L(x+1,y)
- |P(x,y) |
-
-
- pic 3.5
-
-
 - Pic 3.5 shows the situation where we just rendered a pixel P (Previous) at
 - coordinates (x,y) and now have to decide where to go next: to
 - pixel L(x+1,y) (Lower) or H(x+1,y+1) (Higher). Points P,H,L actually
 - represent the middles of the respective pixels. Since the actual line
 - passes somewhere between those two points we must plot the one whose
 - centre is closer to the intersection point I(x+1,real_y). This can be
 - measured by comparing the segments h and l resulting from the
 - intersection. Remembering the dependency used in the previous method we
 - can find that:
-
- dy * (x+1)
- real_y = ------------
- dx
- and:
- dy * (x+1)
- h = y+1-real_y => h = y+1 - ------------
- dx
-
- dy * (x+1)
- l = real_y-y => l = ------------ - y
- dx
- Now, we are interested in comparing l and h:
-
- dy * (x+1)
- l - h = 2 ------------ - 2y-1
- dx
-
-
 - So, if l-h>0, meaning that l>h, the intersection point I is closer to
 - point H and the latter should be rendered, otherwise L should be
 - rendered. Let's multiply both sides by dx:
-
- dx(l - h)= 2dyx + 2dy - 2dxy - dx
-
 - Now, since dx is assumed bigger than 0, the signs of (l-h) and dx(l-h)
 - are the same. Let's call dx(l-h) d and find its sign and value at some
 - step i and the next step i+1:
-
- d = 2dyx -2dxy + 2dy - dx
- i i-1 i-1
-
- d = 2dyx -2dxy + 2dy - dx
- i+1 i i
-
 - We can also find the initial value of d at the very first point, since
 - before it x==0 and y==0:
-
- d = 2dy-dx
- 1
- -------------
-
 - Since the sign of d determines which point to move to next (H or L),
 - let's find the value of d at step i+1 assuming we know its value at
 - step i from the previous calculations:
-
- d - d = 2dyx - 2dxy - 2dyx + 2dxy
- i+1 i i i i-1 i-1
-
- d - d = 2dy(x - x ) - 2dx(y - y )
- i+1 i i i-1 i i-1
-
 - Depending on the decision taken at the previous point, the value of d at
 - the next one would be either:
-
- when H was chosen since x - x =1 and y - y =1
- i i-1 i i-1
-
- d - d = 2dy - 2dx
- i+1 i
-
- d = d + 2dy - 2dx
- i+1 i
- ---------------------
-
- or:
-
- when L was chosen since x - x =1 and y - y =0
- i i-1 i i-1
-
- d -d = 2dy
- i+1 i
- d =d + 2dy
- i+1 i
- --------------
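 - To make the update rules concrete, here is a small trace of the decision
 - variable d for a line with dx=5 and dy=2 (the numbers are arbitrary
 - sample values):

```c
#include <stdio.h>

int ys[6];                             /* Y picked for every X column */

/* trace the decision variable d for a line with dx=5, dy=2,
   xstart<xend, ystart<yend, starting at (0,0) */
void trace(void)
{
 int dx=5, dy=2;
 int d=2*dy-dx;                        /* initial value: 2dy-dx */
 int x,y=0;

 for(x=0;x<=dx;x++)
 {
  ys[x]=y;                             /* pixel (x,y) is rendered */
  printf("(%d,%d) d=%d\n",x,y,d);
  if(d>=0){y++; d+=2*dy-2*dx;}         /* H step: advance along Y too */
  else    {     d+=2*dy;     }         /* L step: advance along X only */
 }
}
```

 - The trace visits (0,0), (1,0), (2,1), (3,1), (4,2), (5,2) - exactly the
 - pixels the ideal line y=(2/5)x passes closest to.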
-
 - This may seem a little complicated but the implementation code certainly
 - doesn't look this way. We just have to find the initial value of d, and
 - then in the loop, depending on its sign, add to it either 2dy-2dx or
 - just 2dy. Since those are constants, they can be calculated before the
 - actual loop. In the proof above we have assumed xstart<xend and
 - ystart<yend; however, in real life we can't guarantee that. The way we
 - can take care of it is just changing increments to decrements when the
 - above assumptions don't hold. We also have to remember that in the loop,
 - when the L point is picked, only the value belonging to the bigger range
 - is incremented (decremented), whereas when picking the H point both X
 - and Y have to be changed, since that is when we are advancing along both
 - axes.
-
-
- void G_line(int x,int y,int x2,int y2,unsigned char colour)
- {
- int dx,dy,long_d,short_d;
- int d,add_dh,add_dl;
- register int inc_xh,inc_yh,inc_xl,inc_yl;
- register int i;
-
- dx=x2-x; dy=y2-y; /* ranges */
-
- if(dx<0){dx=-dx; inc_xh=-1; inc_xl=-1;} /* making sure dx and dy >0 */
- else { inc_xh=1; inc_xl=1; } /* adjusting increments */
- if(dy<0){dy=-dy; inc_yh=-1; inc_yl=-1;}
- else { inc_yh=1; inc_yl=1; }
-
- if(dx>dy){long_d=dx; short_d=dy; inc_yl=0;}/* long range,&making sure either */
- else {long_d=dy; short_d=dx; inc_xl=0;}/* x or y is changed in L case */
-
- d=2*short_d-long_d; /* initial value of d */
 -  add_dl=2*short_d;                          /* d adjustment for L case */
 -  add_dh=2*short_d-2*long_d;                 /* d adjustment for H case */
-
- for(i=0;i<=long_d;i++) /* for all points in longer range */
- {
- G_dot(x,y,colour); /* rendering */
-
- if(d>=0){x+=inc_xh; y+=inc_yh; d+=add_dh;}/* previous point was H type */
- else {x+=inc_xl; y+=inc_yl; d+=add_dl;}/* previous point was L type */
- }
- }
-
-
 - As you can see this algorithm doesn't involve divisions at all. It does
 - have multiplication by two, but almost any modern compiler would optimize
 - it into a 1-bit left shift - a very cheap operation. Or, if we don't have
 - faith in the compiler, we can do it ourselves. Let's optimize this code a
 - little bit. Whenever trying to optimize something we first have to go
 - after the performance of the code in the loop, since it is something done
 - over and over. In the source above we can see the function call to G_dot;
 - let's remember what is inside there... oh, it is y*HW_SCREEN_X_SIZE+x, a
 - multiplication... an operation similar in cost to the division which
 - we just spent so much effort to avoid. Let's think back to the
 - representation of the colourmap: if we just rendered a point and
 - now want to render another one next to it, how would their addresses
 - in the colourmap differ? pic 3.6
-
- +-+-+-
- A|*->|B
- +|+-+-
- C|V| |
- +-+-
- | |
-
- pic 3.6
-
 - Well, if it is along the X axis that we want to advance, we just have to
 - add 1 to the address of pixel A to get to pixel B; if we want to advance
 - along Y we have to add to the address of A the number of bytes in the
 - colourmap separating A and C. This number is exactly HW_SCREEN_X_SIZE,
 - that is, the length of one horizontal line of pixels in memory. Now,
 - using that, let's try, instead of keeping coordinates (x,y), to maintain
 - the point's address in the colourmap:
-
-
- void G_line(int x,int y,int x2,int y2,unsigned char colour)
- {
- int dx,dy,long_d,short_d;
- int d,add_dh,add_dl;
- int inc_xh,inc_yh,inc_xl,inc_yl;
- register int inc_ah,inc_al;
- register int i;
- register unsigned char *adr=G_buffer;
-
- dx=x2-x; dy=y2-y; /* ranges */
-
- if(dx<0){dx=-dx; inc_xh=-1; inc_xl=-1;} /* making sure dx and dy >0 */
- else { inc_xh=1; inc_xl=1; } /* adjusting increments */
- if(dy<0){dy=-dy; inc_yh=-HW_SCREEN_X_SIZE;
- inc_yl=-HW_SCREEN_X_SIZE;}/* to get to the neighboring */
- else { inc_yh= HW_SCREEN_X_SIZE; /* point along Y have to add */
- inc_yl= HW_SCREEN_X_SIZE;}/* or subtract this */
-
- if(dx>dy){long_d=dx; short_d=dy; inc_yl=0;}/* long range,&making sure either */
- else {long_d=dy; short_d=dx; inc_xl=0;}/* x or y is changed in L case */
-
- inc_ah=inc_xh+inc_yh;
 -  inc_al=inc_xl+inc_yl;                      /* increments for point address */
- adr+=y*HW_SCREEN_X_SIZE+x; /* address of first point */
-
- d=2*short_d-long_d; /* initial value of d */
 -  add_dl=2*short_d;                          /* d adjustment for L case */
 -  add_dh=2*short_d-2*long_d;                 /* d adjustment for H case */
-
- for(i=0;i<=long_d;i++) /* for all points in longer range */
- {
- *adr=colour; /* rendering */
-
- if(d>=0){adr+=inc_ah; d+=add_dh;} /* previous point was H type */
- else {adr+=inc_al; d+=add_dl;} /* previous point was L type */
- }
- }
-
-
 - We can see from this code that although the complexity of the preliminary
 - part of the algorithm went up a bit, the code inside the loop is much shorter
 - and simpler. There is just one thing left to add in order to make it all
 - completely usable - clipping, that is, making sure our drawing doesn't go
 - outside the physical boundaries of the colourmap - but this is covered in
 - the next chapter.
-
-
-
- Ambient polygons
- ----------------
 - What we have to do is fill with some colour the inside of the polygon's
 - edges. The easiest and perhaps fastest way to do it is by using a "scan
 - line" algorithm. The idea behind it is to convert a polygon into a
 - collection of, usually, horizontal lines and then render them one at a
 - time. The lines have to be horizontal only because the most common
 - colourmap keeps horizontal lines contiguous in memory, which
 - automatically allows rendering them faster than vertical lines, since
 - increment by 1 is usually faster than addition. There is one inherent
 - limitation to the algorithm: because at each Y height there would be
 - just one horizontal line, only convex polygons can be rendered
 - correctly. It can be seen that concave polygons might sometimes need two
 - or more lines, pic 3.7 (that, by the way, in case you wondered, is what
 - determines the difference in definitions between convex and concave
 - polygons):
-
- *\
- | \ /*
- y|---\ /-|
- | \/ |
- *--------*
-
- pic 3.7
-
 - How can we represent the collection of horizontal lines? Obviously we
 - just need to know at which Y height each one is, the X coordinate of the
 - point where it starts and the X coordinate of the point where it ends.
 - So let's have one "start" and one "end" location for every Y possible
 - on the screen. Since every line is limited by two points, each belonging
 - to a certain edge, we can take one edge at a time, find all points
 - belonging to it, and for each of them update the "start" and "end" at the
 - particular Y height, so that at the end we would have the coordinates of
 - all scan lines in the polygon, pic 3.8:
-
- start end
- +-----+---+ 1 2 3 4 5 6 7 8
- | 1 | 5 |- - - - - -> *------\
- | 2 | 8 | \------------*
- | 2 | 7 |- - - - - - -> \----------/
- | 3 | 7 | \------/
- | 4 | 6 |- - - - - - - -> *----*
- +-----+---+
- pic 3.8
-
 - Initially we put the biggest possible value in all locations of the
 - "start" array and the smallest possible in the locations of the "end"
 - array. (Those, by the way, are defined in <limits.h>, and it is not a
 - bad practice to use them.) Each time we've calculated a point on an edge
 - we go to the "start" and "end" locations at its Y height, and if the X
 - of the point is less than what we have in "start" we write it there.
 - Similarly, if this same X is bigger than the value in the "end" location
 - we put it there too. Because of the way the arrays were initialized, the
 - first point calculated for every Y height would change both the "start"
 - and the "end" locations. Let's look at the code that finds Xs for each
 - edge, the so-called scanning function:
-
- void GI_scan(int *edges,int dimension,int skip)
- {
- signed_32_bit cur_v[C_MAX_DIMENSIONS]; /* initial values for Z+ dims */
- signed_32_bit inc_v[C_MAX_DIMENSIONS]; /* increment for Z+ dimensions */
- signed_32_bit tmp;
- int dx,dy,long_d,short_d;
- register int d,add_dh,add_dl;
- register int inc_xh,inc_yh,inc_xl,inc_yl;
- int x,y,i,j;
 -  int *v1,*v2;                              /* first and second vertices */
-
- v1=edges; edges+=skip; v2=edges; /* length ints in each */
-
- if(C_line_y_clipping(&v1,&v2,dimension)) /* vertical clipping */
- {
- dx=*v2++; dy=*v2++; /* extracting 2-D coordinates */
- x=*v1++; y=*v1++; /* v2/v1 point remaining dim-2 */
- dimension-=2;
-
- if(y<G_miny) G_miny=y;
- if(y>G_maxy) G_maxy=y;
- if(dy<G_miny) G_miny=dy;
- if(dy>G_maxy) G_maxy=dy; /* updating vertical size */
-
- dx-=x; dy-=y; /* ranges */
-
- if(dx<0){dx=-dx; inc_xh=-1; inc_xl=-1;} /* making sure dx and dy >0 */
- else { inc_xh=1; inc_xl=1; } /* adjusting increments */
- if(dy<0){dy=-dy; inc_yh=-1; inc_yl=-1;}
- else { inc_yh=1; inc_yl=1; }
-
- if(dx>dy){long_d=dx;short_d=dy;inc_yl=0;} /* long range,&making sure either */
- else {long_d=dy;short_d=dx;inc_xl=0;} /* x or y is changed in L case */
-
- d=2*short_d-long_d; /* initial value of d */
 -   add_dl=2*short_d;                        /* d adjustment for L case */
 -   add_dh=2*short_d-2*long_d;               /* d adjustment for H case */
-
- for(i=0;i<dimension;i++) /* for all remaining dimensions */
- {
- tmp=v1[i]; tmp<<=G_P; cur_v[i]=tmp; /* f.p. start value */
- tmp=v2[i]-v1[i]; tmp<<=G_P; /* f.p. increment */
- if(long_d>0) inc_v[i]=tmp/long_d; /* if needed (non 0 lines) */
- }
-
- for(i=0;i<=long_d;i++) /* for all points in longer range */
- {
 -    if(x<G_start[y])                        /* left of the leftmost */
- {
- G_start[y]=x; /* the begining of scan line */
- for(j=0;j<dimension;j++)
- G_rest_start[j][y]=cur_v[j]; /* all other dimensions */
- }
-
 -    if(G_end[y]<x)                          /* right of the rightmost */
- {
- G_end[y]=x; /* the end of scan line */
- for(j=0;j<dimension;j++)
- G_rest_end[j][y]=cur_v[j]; /* and for other dimension */
- }
-
- if(d>=0){x+=inc_xh;y+=inc_yh;d+=add_dh;} /* previous dot was H type */
- else {x+=inc_xl;y+=inc_yl;d+=add_dl;} /* previous dot was L type */
- for(j=0;j<dimension;j++)
- cur_v[j]+=inc_v[j]; /* for all other dimensions */
- }
- }
- }
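 - The initialization routine mentioned above, GI_boarder_array_init, is
 - not listed in this section; a plausible version of it, using the
 - <limits.h> constants as suggested, might look like this (the array sizes
 - and the resetting of G_miny/G_maxy are assumptions):

```c
#include <limits.h>

#define HW_SCREEN_Y_SIZE 200           /* assumed vertical size */

int G_start[HW_SCREEN_Y_SIZE];         /* leftmost X of each scan line */
int G_end[HW_SCREEN_Y_SIZE];           /* rightmost X of each scan line */
int G_miny,G_maxy;                     /* vertical extent of the polygon */

void GI_boarder_array_init(void)
{
 int i;

 G_miny=INT_MAX; G_maxy=INT_MIN;       /* first scanned point resets both */
 for(i=0;i<HW_SCREEN_Y_SIZE;i++)
 {
  G_start[i]=INT_MAX;                  /* any X is smaller than this */
  G_end[i]=INT_MIN;                    /* any X is bigger than this */
 }
}
```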
-
 - As you can see, this edge scanning function doesn't differ much
 - from our most basic line rendering code. One difference is that we are
 - updating the vertical size of the polygon, so that after all edges have
 - gone through this function we would have the correct vertical dimensions
 - in G_miny and G_maxy. There is another difference: the scan function is
 - designed to process other dimensions too, that is, it would interpolate,
 - similarly to X, other values specified in the vertices, finding them for
 - all Ys in the range. We would need that when adding the interpolative
 - shading feature. As to ambient polygon rendering, after scanning is over
 - we would just have to render all horizontal scan lines in the range
 - [G_miny,G_maxy]:
-
- void G_ambient_polygon(int *edges,int length,unsigned char colour)
- {
 -  int new_edges[G_MAX_TUPLES];              /* vertically clipped polygon */
- int new_length; /* although no edges there yet */
- register unsigned char *l_adr,*adr;
- register int beg,end,i;
-
- GI_boarder_array_init(); /* initializing the arrays */
-
- new_length=C_polygon_x_clipping(edges,new_edges,2,length);
-
- for(i=0;i<new_length;i++)
 -   GI_scan(&new_edges[i*2],2,2);            /* searching polygon borders */
-
- if(G_miny<=G_maxy) /* nothing to do? */
- {
- l_adr=G_buffer+G_miny*HW_SCREEN_X_SIZE; /* addr of 1st byte of 1st line */
-
- for(; G_miny<=G_maxy; G_miny++,
- l_adr+=HW_SCREEN_X_SIZE) /* rendering all lines */
- {
- adr=l_adr+(beg=G_start[G_miny]); /* addr of the current line */
- end=G_end[G_miny]; /* ends here */
-
- for(;beg<=end;beg++) *adr++=colour; /* rendering single line */
- }
- }
- }
-
 - As to further optimization, it doesn't appear we can do a lot without
 - leaving the comfortable boundaries of C. On the other hand, when there's
 - really nothing else we can do to speed things up, when all manuals
 - are read and all friends are out of ideas, let's do it, let's go
 - for assembly. I don't think rewriting all functions this way is
 - appealing. Moreover, only rewriting the real execution bottleneck
 - would give a considerable speed increase. That's, perhaps, one of
 - the reasons why modern compilers have such a nice feature as inline
 - assembly, allowing one to add low-level code directly into C source.
-
 - Looking at the code above we can see the horizontal line rendering loop.
 - This is an obvious candidate, since it executes fairly often and does
 - quite a lot of iterations, especially compared with the surrounding code.
-
 - Different C compilers have different provisions for inline assembly.
 - The GNU C compiler has a quite powerful syntax for specifying assembly
 - code. It is:
-
- asm("assembly_command %1,%0":"=constraint" (result)
- :" constraint" (argument),
- " constraint" (argument),
- ...
- :" clobbered_reg", "clobbered_reg" ...
- );
-
 - The assembly instruction is specified with aliases allowing one
 - to add names of C variables into it. It also allows specifying
 - the CPU registers the command we are inserting might clobber.
 - This is particularly important if we are expecting to compile with
 - an optimizing option or have register variables in the code. We
 - don't want the program trying to use the contents of a register we
 - just destroyed by an inline assembly instruction. If we are specifying
 - the clobbered registers, on the other hand, the compiler would make
 - sure that no useful information is in these registers at the
 - moment the inline assembly code starts to execute.
-
 - The WATCOM C compiler has a different provision for inline assembly:
-
 -        return_type c_function(parameter1,parameter2, ...);
 -        #pragma aux c_function=         \
- "assembly_command" \
- "assembly_command" \
- parm [reg] [reg] \
- value [reg] \
- modify [reg reg];
-
 - It is implemented by providing sort of a pseudo C function which
 - is treated by the compiler pretty much like a macro. "parm" specifies
 - which registers take in parameters from the function's argument list,
 - "value" - from what register, if any, the return value is taken, and
 - "modify" specifies the clobbered registers.
-
 - For the DJGPP compiler, the code substituting the line rendering loop:
-
- for(;beg<=end;beg++) *adr++=colour; /* rendering single line */
-
- may look like:
-
 -  asm("movl %0,%%ecx" :: "g" (end):"%ecx");
 -  asm("subl %0,%%ecx" :: "g" (beg):"%ecx");
 -  asm("incl %%ecx" ::: "%ecx");  /* the loop runs end-beg+1 times */
- asm("movl %0,%%edi" :: "g" (adr):"%edi");
- asm("movl %0,%%eax" :: "g" (colour):"%eax");
- asm("cld"); /* increment when stosb */
- asm("rep");
- asm("stosb %al,(%edi)"); /* rendering line here */
-
- for WATCOM:
-
 -  void a_loop(unsigned char *adr,int end,int beg,int colour);
 -  #pragma aux a_loop=             \
 -          "sub ecx,esi"           \
 -          "inc ecx"               \
 -          "cld"                   \
 -          "rep stosb"             \
 -          parm [edi] [ecx] [esi] [al];
-
 - Not terribly complicated code to write, but something giving a
 - considerable speed increase. Then again, we might have used the C
 - library function memset for filling chunks of memory:
 -
 -             void *memset(void *s,int c,size_t n);
 -
 -             memset(adr,colour,end-beg+1);       /* filling scan line */
 -
 - This function might have just the code we wrote inside of
 - it. Let's not believe assumptions, neither pro nor con, and go check
 - that out. With DJGPP or, for that matter, most UNIX compilers we can
 - just extract and disassemble the memset code from libc.a (the standard
 - C library) and see for ourselves. Let's go:
-
- ar -x libc.a memset.o
- objdump -d memset.o
-
- We would see something like:
-
- memset.o: file format coff-go32
-
- Disassembly of section .text:
- 00000000 <_memset> pushl %ebp
- 00000001 <_memset+1> movl %esp,%ebp
- 00000003 <_memset+3> pushl %edi
- 00000004 <_memset+4> movl 0x8(%ebp),%edi
- 00000007 <_memset+7> movl 0xc(%ebp),%eax
- 0000000a <_memset+a> movl 0x10(%ebp),%ecx
- 0000000d <_memset+d> jcxz 00000011 <_memset+11>
- 0000000f <_memset+f> repz stosb %al,%es:(%edi)
- 00000011 <_memset+11> popl %edi
- 00000012 <_memset+12> movl 0x8(%ebp),%eax
- 00000015 <_memset+15> leave
- 00000016 <_memset+16> ret
- 00000017 <_memset+17> Address 0x18 is out of bounds.
- Disassembly of section .data:
-
 - Very similar to what we wrote, isn't it? Might other compilers have
 - worse library code? It might be so; let's try WATCOM C.
 - Let's go:
-
- wlib clibs.lib *memset
- wdisasm memset.obj
-
- that's what we would see:
-
- Module: memset
- Group: 'DGROUP' CONST,CONST2,_DATA,_BSS
-
- Segment: '_TEXT' BYTE USE32 00000023 bytes
- 0000 57 memset push edi
- 0001 55 push ebp
- 0002 89 e5 mov ebp,esp
- 0004 8b 45 10 mov eax,+10H[ebp]
- 0007 8b 4d 14 mov ecx,+14H[ebp]
- 000a 8b 7d 0c mov edi,+0cH[ebp]
- 000d 06 push es
- 000e 1e push ds
- 000f 07 pop es
- 0010 57 push edi
- 0011 88 c4 mov ah,al
- 0013 d1 e9 shr ecx,1
- 0015 f2 66 ab repne stosw
- 0018 11 c9 adc ecx,ecx
- 001a f2 aa repne stosb
- 001c 5f pop edi
- 001d 07 pop es
- 001e 89 f8 mov eax,edi
- 0020 c9 leave
- 0021 5f pop edi
- 0022 c3 ret
-
- No disassembly errors
-
- ------------------------------------------------------------
-
 - Isn't it very similar to what we just did using inline assembly?
 - The WATCOM C code is even trying to be smarter than us by storing as
 - much as it can with a word store instruction; since every word store
 - in this case is equivalent to two char stores, there should be
 - half as many iterations. Of course there would be some time spent
 - on the function call itself, but most likely the performance of the
 - inline assembly code we wrote and of the memset call would be very
 - similar. It is just yet another example of how easy it might be
 - in some cases to achieve results by very simple means instead of
 - spending sleepless hours in front of the glowing box (unfortunately
 - only sleepless hours teach us how to do things by simple means, but
 - that's a topic for another story only marginally dedicated to
 - 3-D graphics).
-
 - By the way, in the provided source code there is an independent
 - approach to fast memory moves and fills. That is, the functions
 - performing those are brought out into the hardware interface as:
 - HW_set_char; HW_set_int; HW_copy_char; HW_copy_int. And their
 - implementation may actually be either of the ways described above,
 - depending on the particular platform.
-
-
- Shaded polygons.
- ----------------
-
 - Shaded polygons are used to smooth the appearance of faceted models,
 - that is, those where we approximate real, curved surfaces by a
 - set of plane polygons. Interpolative or Gouraud shading (Gouraud, 1971),
 - which we will discuss, is the easiest to implement, and it is also not
 - very expensive in terms of speed. The idea is that we carry a colour
 - intensity value in every vertex of the polygon and linearly interpolate
 - those values to find the colour of each pixel inside the polygon.
-
- Finally this is the place where implemented N-dimensional scanning
- procedure comes in handy. Indeed colour intensity can be pretty
- much handled as just an extra dimension as far as edge scanning
- and clipping are concerned.
-
- *I3
- / \
- / \
- I0* \
- \ *I2
- \ I /
- IT1*---*---*IT2
- \ /
- \ /
- \ /
- *I1
-
- pic 3.9
-
 - We can obtain values on the left and right borders of a polygon (IT1,
 - IT2 on pic 3.9) by interpolating the colour intensity value along every
 - edge during edge scanning, just as the "start"/"end" X values were found
 - for every horizontal line. Afterwards, when rendering a certain
 - horizontal scan line, we can further interpolate between the "start" and
 - "end" intensities of this line, finding the colour for the pixels
 - belonging to it (I on pic 3.9). Fixed point math is probably the most
 - convenient way to implement this interpolation:
-
- void G_shaded_polygon(int *edges,int length)
- {
 -  int new_edges[G_MAX_SHADED_TUPLES];       /* vertically clipped polygon */
- int new_length;
- register unsigned char *l_adr,*adr;
- register int beg,end,i;
 -  register signed_32_bit cur_c,inc_c;       /* current colour and its inc */
-
- GI_boarder_array_init(); /* initializing the array */
-
- new_length=C_polygon_x_clipping(edges,new_edges,3,length);
-
- for(i=0;i<new_length;i++)
 -   GI_scan(&new_edges[i*3],3,3);            /* searching polygon borders */
-
- if(G_miny<=G_maxy) /* nothing to do? */
- {
- l_adr=G_buffer+G_miny*HW_SCREEN_X_SIZE; /* addr of 1st byte of 1st line */
-
- for(; G_miny<=G_maxy; G_miny++,
- l_adr+=HW_SCREEN_X_SIZE) /* rendering all lines */
- {
- adr=l_adr+(beg=G_start[G_miny]); /* addr of the current line */
- end=G_end[G_miny]; /* ends here */
-
- cur_c=G_0_start[G_miny]; /* colour starts with this value */
- inc_c=G_0_end[G_miny]-cur_c;
- if(end>beg) inc_c/=end-beg+1; /* f.p. increment */
-
- for(;beg<=end;beg++) /* render this line */
- {
- *adr++=cur_c>>G_P; /* rendering single point */
- cur_c+=inc_c; /* incrementing colour */
- }
- }
- }
- }
-
 - As you see, the code above doesn't look that much different from the
 - ambient polygon rendering source.
 -
 - There is one small catch. I said that intensities can be handled pretty
 - much as an extra dimension. In fact we can consider a shaded polygon on
 - the screen to take 2 space dimensions and to have one colour dimension.
 - But would all vertices of this polygon belong to one plane in such 3-D
 - space? If they don't, the shading would look rather weird, especially
 - when this polygon starts to rotate. The common solution is to limit
 - polygons to just triangles. Why? Well, three points always belong to the
 - same plane in 3-D space.
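 - A common way of enforcing the triangles-only rule is to split an
 - arbitrary polygon into a fan of triangles sharing the first vertex. A
 - hypothetical helper doing that for the (x,y,intensity) tuples used above
 - might look like:

```c
/* Split a convex polygon of n (x,y,intensity) vertices into a fan of
   n-2 triangles around vertex 0; tri receives 9 ints per triangle. */
int T_triangulate(const int *poly,int n,int *tri)
{
 int t,k,count=0;

 for(t=1;t<n-1;t++)                    /* triangles (0, t, t+1) */
 {
  for(k=0;k<3;k++) *tri++=poly[k];     /* vertex 0 */
  for(k=0;k<3;k++) *tri++=poly[t*3+k]; /* vertex t */
  for(k=0;k<3;k++) *tri++=poly[(t+1)*3+k];
  count++;
 }
 return count;                         /* number of triangles produced */
}
```

 - A fan is only guaranteed correct for convex polygons, which is usually
 - acceptable since concave ones already break the scan-line algorithm.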
-
-
-
- Textured polygons.
- ------------------
-
 - It might seem that we can extend the interpolation technique used to
 - calculate colour intensity for shading onto texture rendering: just
 - keep texture X and Y in every vertex of the polygon, interpolate
 - those two along the edges and then along the horizontal lines, obtaining
 - this way a texture (X,Y) for every pixel to be rendered. This, however,
 - does not work... Or, to be more exact, it stops working when perspective
 - projection is used. The reason is simple: the perspective transformation
 - is not linear.
 -
 - One may ask, why did it work for interpolative shading? After all,
 - we were using perspective transformation all along. The answer is,
 - it did not work, but the visual effect of neglecting perspective
 - for shading is quite acceptable; neglecting it for texture rendering,
 - however, is not. I believe it has to do with different image frequencies.
-
 - Just consider the following situation: a rectangular polygon is
 - displayed on the screen so that one of its sides is much closer
 - to us than the other one, which is almost disappearing into
 - infinity. Where in the texture map do you think lines A,B,C and D
 - would be mapped from by the linear interpolation method? How do you
 - think the texture on the polygon would look?
-
-
- +\ +--/---/+1
- A|--\ A|/ / |
- B|----\ B|/ |
- C|----/ C|\ <------ Missed area
- D|--/ D|\ \ |
- +/ +--\---\+2
-
- Polygon on the Texture
- Screen Map
-
-
- pic 3.10
-
 - Well, it would look as if someone had carved out the triangular area
 - marked "missed" on the picture above and connected points 1 and
 - 2 together - real ugly... In less dramatic cases texture rendered this
 - way might look nicer, being perfect when the effect of the perspective
 - transformation is negligible: when all points are at almost the same
 - distance from the viewer, or very far away from the viewing plane.
-
 - To always get nice looking results we would have to consider
 - what is really happening when a texture is being rendered:
-
-
- u^ Texture Space
- | *T(u,v)
- +---->
- o v
-
- y ^
- | z
- | U \ /*W(x,y,z)/ V World Space
- | / \ /
- | / \ /
- |/ * O
- +--------------> x
-
- j^
- |
- | *S(i,j) Screen Space
- |
- +------->i
-
-
- pic 3.11
-
 - Consider the three spaces pictured above: the texture space, the world
 - space (the one before projection) and finally the contents of the screen -
 - the screen or image space. We may think of them just as the original
 - polygon/texture map in the case of texture space, the polygon/texture
 - map turned somehow in the case of world space, and finally the same
 - projected onto the screen.
 -
 - If we know the mapping of the origin and two orthogonal vectors from the
 - texture space into the world space, we can express the mapping of any
 - point through them:
-
- x = Ox + v*Vx + u*Ux
- y = Oy + v*Vy + u*Uy
- z = Oz + v*Vz + u*Uz
-
 - Where Vx...Uz are the corresponding components of the respective vectors.
 - Further, a point in the world space W(x,y,z) is perspectively
 - transformed into the screen space S(i,j):
 -
 -                      i = x / z
 -                      j = y / z
 -
 - (the actual transformation used would most likely also contain a
 - multiplier - focus - but let's worry about particularities at some later
 - stage). In order to perform mapping, on the other hand, we need the u,v
 - texture coordinates as functions of the screen coordinates i,j:
 -
 -                      x = i * z
 -                      y = j * z
-
- or:
- Ox + v*Vx + u*Ux = i * [ Oz + v*Vz + u*Uz ]
- Oy + v*Vy + u*Uy = j * [ Oz + v*Vz + u*Uz ]
-
- trying to express u,v through i,j:
-
- v*(Vx-i*Vz) + u*(Ux-i*Uz) = i*Oz - Ox
- v*(Vy-j*Vz) + u*(Uy-j*Uz) = j*Oz - Oy
-
- further:
- (i*Oz - Ox) - u*(Ux - i*Uz)
- v = -----------------------------
- (Vx - i*Vz)
-
- (i*Oz - Ox) - v*(Vx - i*Vz)
- u = -----------------------------
- (Ux - i*Uz)
- and:
-
- (i*Oz - Ox) - u*(Ux - i*Uz)
- ----------------------------- * (Vy - j*Vz) + u*(Uy - j*Uz) = j*Oz - Oy
- (Vx - i*Vz)
-
- (i*Oz - Ox) - v*(Vx - i*Vz)
- v*(Vy - j*Vz) + ----------------------------- * (Uy - j*Uz) = j*Oz - Oy
- (Ux - i*Uz)
-
- or in nicer form:
-
- i*(Vz*Oy-Vy*Oz) + j*(Vx*Oz-Vz*Ox) + (Vy*Ox-Vx*Oy)
- u = --------------------------------------------------
- i*(Vy*Uz-Vz*Uy) + j*(Vz*Ux-Vx*Uz) + (Vx*Uy-Vy*Ux)
-
- i*(Uy*Oz-Uz*Oy) + j*(Uz*Ox-Ux*Oz) + (Ux*Oy-Uy*Ox)
- v = --------------------------------------------------
- i*(Vy*Uz-Vz*Uy) + j*(Vz*Ux-Vx*Uz) + (Vx*Uy-Vy*Ux)
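 - The formulas above are easy to verify numerically: map some texture
 - point (u,v) into the world space, project it, then recover (u,v) back
 - from (i,j) (a standalone check with arbitrary sample vectors):

```c
/* Map texture point (u,v) into the world, project it with i=x/z,
   j=y/z, then recover (u,v) back using the reverse mapping formulas;
   the O,U,V vectors are arbitrary sample values keeping z>0. */
void check_mapping(double u,double v,double *ru,double *rv)
{
 double Ox=1,Oy=2,Oz=5;               /* mapping of the texture origin */
 double Ux=1,Uy=0,Uz=0.5;             /* mapped texture axes */
 double Vx=0,Vy=1,Vz=0.25;

 double x=Ox+v*Vx+u*Ux;               /* direct mapping into the world */
 double y=Oy+v*Vy+u*Uy;
 double z=Oz+v*Vz+u*Uz;
 double i=x/z, j=y/z;                 /* perspective projection */

 double den=i*(Vy*Uz-Vz*Uy)+j*(Vz*Ux-Vx*Uz)+(Vx*Uy-Vy*Ux);
 *ru=(i*(Vz*Oy-Vy*Oz)+j*(Vx*Oz-Vz*Ox)+(Vy*Ox-Vx*Oy))/den;
 *rv=(i*(Uy*Oz-Uz*Oy)+j*(Uz*Ox-Ux*Oz)+(Ux*Oy-Uy*Ox))/den;
}
```

 - Both u and v share the same denominator, so in a real rasterizer it has
 - to be computed only once per pixel.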
-
 - At this point we have formulas telling us from where in the texture
 - space a point in the screen space originates. There are several
 - implementation problems. The first few are simple: the actual
 - perspective transformation is i=x*focus/z rather than i=x/z; plus, we
 - are performing a screen centre translation so that screen space (0,0)
 - is in the middle of the screen and not in the upper left corner. Hence
 - the original dependency should have been:
 -
 -          i = (x * focus) / z + HW_SCREEN_X_CENTRE
 -          j = (y * focus) / z + HW_SCREEN_Y_CENTRE
-
-
 - And so, finally, the so-called reverse mapping formulas have to be
 - amended accordingly:
-
- ((i-HW_SCREEN_X_CENTRE)/focus) * (Vz* ...
- u = -------------------------------------------
- ...
-
- ...
-
-
- Another, slightly more difficult, part has to do with texture
- scaling. If the size of the texture vectors is picked to be 1, that
- assumes no scaling, that is, unit size along the polygon in the
- world space corresponds to unit size in the texture space. What we
- however want to do in a lot of instances is to map a smaller
- texture onto a bigger polygon, or the other way around, a bigger
- texture onto a smaller polygon. Let's try scaling the texture
- space:
-
- T
- *---------->
- v| * | S
- | | \ *--------------------->
- | | \ | v'| * | v T
- | *-- ---* | | \ ----- = -----
- V- - - - - | | | \ | v' S
- | | \
- | |mapping \ | v'*T
- | |of the \ v = -----
- | |polygon \ | S
- | *---------------*
- V- - - - - - - - - - -|
-
- pic 3.12
-
-
-
- now we can put the expression for v into the direct mapping
- equations:
-
-
- Vx*T Ux*T
- x = Ox + v'* ------ + u'* ------
- S S
-
- ...
-
- Vx and Ux multiplied by T are in fact just the projections of the
- vectors enclosing the whole texture space; let's call them
- V'x == Vx*T, U'x == Ux*T etc.
-
-
- V'x U'x
- x = Ox + v'* ----- + u'* -----
- S S
- ...
-
-
- considering this, the reverse mapping equations would look like:
-
-
- ((i-HW_SCREEN_X_CENTRE)/focus) * (V'z* ...
- u'= S * --------------------------------------------
- ...
-
- ...
-
- This allows us to scale the texture on the polygon by changing the
- S multiplier.
-
- Another problem has to do with the point O - the mapping of the
- origin of the texture space into the world space. The thing is, if
- the mapping of this point lands on the Z==0 plane, the mapping
- equations above no longer hold, simply because the perspective
- transformation has a singularity there. In general we deal with
- this problem of the perspective transformation by clipping the
- polygon against some plane in front of Z==0. And do we really need
- the mapping of exactly the point O of the texture space? Not
- necessarily, it may be any point in the texture space, assuming we
- change the direct mapping equations a bit:
-
-
- x' = Ox + (v'-v'0)*V'x + (u'-u'0)*U'x
- ...
-
-
- where (v'0,u'0) are the coordinates in the texture space of the
- point whose mapping into the world space, (Ox,Oy,Oz), we do have.
- The reverse mapping equations would then be:
-
- ((i-HW_SCREEN_X_CENTRE)/focus) * (V'z* ...
- u'= S * ------------------------------------------- + u'0
- ...
-
- ...
-
- How can we obtain a mapping of, now, any point from the texture
- space into the world space? If we associate with every polygon
- vertex, besides just x,y,z, also the u,v of where this vertex is
- in the texture, we can treat it as a 5-D polygon and perform
- clipping on it, obtaining vertices with coordinates suitable for
- the perspective transformation; hence any of them would be fine
- for the mapping equations. (How to do clipping on polygons is
- covered in the next chapter.)
-
- One last thing: in order to render regular patterns, ones that
- repeat themselves, we may want to introduce a texture size
- parameter and make sure that any access to the texture outside
- this size wraps around to pick a colour from somewhere inside the
- texture. This is easily done if the texture size is chosen to be
- some power of 2, for in this case we can perform range checking
- with just a logical "and" operation.
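As a sketch of this trick (the 256x256 size below is an assumption, and texture_at is just an illustrative helper, not part of the library): with a power-of-2 texture the wrap-around costs one "and" per coordinate instead of a modulo:

```c
#define LOG_TEXTURE_SIZE 8                        /* 256x256 texture, an assumption */
#define TEXTURE_MASK ((1 << LOG_TEXTURE_SIZE) - 1)

/* fetch a texel, folding any out-of-range u,v back into the texture */
unsigned char texture_at(const unsigned char *texture, int u, int v)
{
    u &= TEXTURE_MASK;                            /* wraps around, since size */
    v &= TEXTURE_MASK;                            /* is a power of 2          */
    return texture[(v << LOG_TEXTURE_SIZE) + u];
}
```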
-
- However, using the above equations for texture mapping appears to
- be quite expensive: there are a few divisions involved per rendered
- pixel. How do we optimize that? Perhaps the easiest way is
- horizontal line subdivision, that is, calculating the real texture
- mapping only every N pixels and linearly interpolating in between.
- This method is used in the implementation below:
-
-
- void G_textured_polygon(int *edges,int length,int *O,int *u,int *v,
- unsigned char *texture,int log_texture_size,
- int log_texture_scale
- )
- {
- int new_edges[G_MAX_SHADED_TUPLES]; /* vertically clipped polygon */
- int new_length;
- signed_32_bit Vx,Vy,Vz;
- signed_32_bit Ux,Uy,Uz; /* extracting vectors */
- signed_32_bit x0,y0,z0;
- signed_32_bit ui,uj,uc;
- signed_32_bit vi,vj,vc;
- signed_32_bit zi,zj,zc; /* back to texture coefficients */
- signed_32_bit v0,u0;
- signed_32_bit xx,yy,zz,zzz;
- int xstart,xend;
- signed_32_bit txstart,tystart;
- signed_32_bit txend,tyend;
- signed_32_bit txinc,tyinc;
- register unsigned char *g_adr,*adr;
- register int i,x,y;
- signed_32_bit txtrmasc=(0x1<<(log_texture_size+G_P))-0x1;
- log_texture_scale+=G_P;
-
- GI_boarder_array_init(); /* initializing the array */
-
- new_length=C_polygon_x_clipping(edges,new_edges,4,length);
-
- for(i=0;i<new_length;i++)
- GI_scan(&new_edges[i*4],2,4); /* searching polygon borders */
-
- if(G_miny<=G_maxy) /* nothing to do? */
- {
- x0=O[0]; y0=O[1]; z0=O[2];
- u0=O[3]<<G_P; v0=O[4]<<G_P; /* world point <-> texture point */
-
- Vx=v[0]; Vy=v[1]; Vz=v[2];
- Ux=u[0]; Uy=u[1]; Uz=u[2]; /* extracting vectors */
-
- ui=(Vz*y0)-(Vy*z0);
- uj=(Vx*z0)-(Vz*x0);
- uc=(Vy*x0)-(Vx*y0);
- vi=(Uy*z0)-(Uz*y0);
- vj=(Uz*x0)-(Ux*z0);
- vc=(Ux*y0)-(Uy*x0);
- zi=(Vy*Uz)-(Vz*Uy);
- zj=(Vz*Ux)-(Vx*Uz);
- zc=(Vx*Uy)-(Vy*Ux); /* back to texture */
-
- g_adr=G_buffer+G_miny*HW_SCREEN_X_SIZE; /* addr of 1st byte of 1st line */
-
- for(; G_miny<=G_maxy; G_miny++,
- g_adr+=HW_SCREEN_X_SIZE) /* rendering all lines */
- {
- xstart=G_start[G_miny];
- adr=g_adr+xstart;
- xstart-=HW_SCREEN_X_CENTRE;
- x=xstart;
- xend=G_end[G_miny]-HW_SCREEN_X_CENTRE;
- y=G_miny-HW_SCREEN_Y_CENTRE;
-
- xx=((y*uj)>>T_LOG_FOCUS)+uc;
- yy=((y*vj)>>T_LOG_FOCUS)+vc;
- zz=((y*zj)>>T_LOG_FOCUS)+zc; /* valid for the whole run */
-
- if(( zzz=zz+((x*zi)>>T_LOG_FOCUS) )!=0)
- {
- txend=( (xx+ ((x*ui)>>T_LOG_FOCUS)) <<log_texture_scale )/zzz +u0;
- tyend=( (yy+ ((x*vi)>>T_LOG_FOCUS)) <<log_texture_scale )/zzz +v0;
- }
-
- for(;xstart<xend;)
- {
- x+=G_LINEAR; if(x>xend) x=xend; /* size of linear run */
- txstart=txend;
- tystart=tyend;
-
- if(( zzz=zz+((x*zi)>>T_LOG_FOCUS) )!=0)
- {
- txend=( (xx+ ((x*ui)>>T_LOG_FOCUS)) <<log_texture_scale )/zzz +u0;
- tyend=( (yy+ ((x*vi)>>T_LOG_FOCUS)) <<log_texture_scale )/zzz +v0;
- }
-
- length=x-xstart; /* ends here */
- if(length!=0) { txinc=(txend-txstart)/length;
- tyinc=(tyend-tystart)/length;
- }
-
- for(;xstart<x;xstart++) /* linear run */
- {
- txstart&=txtrmasc; tystart&=txtrmasc; /* wrap around */
- *adr++=*(texture+((tystart>>G_P)<<log_texture_size)+(txstart>>G_P));
- txstart+=txinc; tystart+=tyinc; /* new position in the texture */
- }
- }
- }
- }
- }
-
-
-
-
- Other possible speed-up techniques are: area subdivision involving
- 2-D interpolation; faking texture mapping with polynomials (the
- latter has not very much to do with the true mapping equations
- described here, however); and rendering the texture along
- constant-z lines rather than along horizontal lines. The advantage
- gained by the last one is that along every constant-z line the
- perspective transformation is linear (the perspective
- transformation is i=x/z; if z==const we can write it as i=coef*x
- where coef=1/z, which is linear of course).
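The constant-z observation can be checked with a tiny sketch (the focus value here is an arbitrary assumption): for a fixed z the divide folds into a single multiplier, so equal steps in world x give equal steps in screen i, and texture coordinates could be interpolated along such a run without error:

```c
/* perspective i = x*focus/z; for constant z this is coef*x, linear in x */
double screen_i(double x, double z, double focus)
{
    return x * (focus / z);     /* coef = focus/z computed once per run */
}
```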
-
- The function above might be extended with interpolative shading as
- well. To do that, special consideration has to be given to the
- palette's layout, that is, where and how to keep different
- intensities of the same colour. But once that is decided, the
- coding is quite straightforward.
-
- * * *